Overview

Dataset statistics

Number of variables: 14
Number of observations: 2356284
Missing cells: 11809696
Missing cells (%): 35.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 251.7 MiB
Average record size in memory: 112.0 B
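The percentage and per-record figures above follow from the raw counts. A minimal Python sketch of that arithmetic, assuming the total cell count is variables times observations and that 1 MiB is 2**20 bytes (the variable names are illustrative and do not come from the report):

```python
# Raw figures copied from the dataset statistics above.
n_variables = 14
n_observations = 2_356_284
missing_cells = 11_809_696
memory_mib = 251.7

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = memory_mib * 2**20 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (35.8% and 112.0 B).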

Variable types

Categorical (CAT): 7
Numeric (NUM): 7

Warnings

country_region_code has a high cardinality: 134 distinct values
country_region has a high cardinality: 135 distinct values
sub_region_1 has a high cardinality: 1855 distinct values
sub_region_2 has a high cardinality: 9682 distinct values
metro_area has a high cardinality: 65 distinct values
iso_3166_2_code has a high cardinality: 2143 distinct values
date has a high cardinality: 203 distinct values
sub_region_1 has 40330 (1.7%) missing values
sub_region_2 has 401840 (17.1%) missing values
metro_area has 2343224 (99.4%) missing values
iso_3166_2_code has 1942157 (82.4%) missing values
census_fips_code has 1828595 (77.6%) missing values
retail_and_recreation_percent_change_from_baseline has 825720 (35.0%) missing values
grocery_and_pharmacy_percent_change_from_baseline has 854377 (36.3%) missing values
parks_percent_change_from_baseline has 1210576 (51.4%) missing values
transit_stations_percent_change_from_baseline has 1130892 (48.0%) missing values
workplaces_percent_change_from_baseline has 109641 (4.7%) missing values
residential_percent_change_from_baseline has 1120763 (47.6%) missing values
metro_area is uniformly distributed
retail_and_recreation_percent_change_from_baseline has 25590 (1.1%) zeros
grocery_and_pharmacy_percent_change_from_baseline has 40653 (1.7%) zeros
workplaces_percent_change_from_baseline has 40157 (1.7%) zeros
residential_percent_change_from_baseline has 65648 (2.8%) zeros

Reproduction

Analysis started: 2020-10-03 05:35:35.965433
Analysis finished: 2020-10-03 05:37:07.701105
Duration: 1 minute and 31.74 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
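A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the function name, file paths and report title are hypothetical, and the settings stored in config.yaml are not reproduced. It assumes pandas and pandas-profiling v2.9.0 are installed.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imports live inside the function so that merely defining it does not
    # require pandas-profiling to be installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    # low_memory=False lets pandas infer each column's dtype from the whole file.
    df = pd.read_csv(csv_path, low_memory=False)

    # Hypothetical title; the original report's title is not known.
    profile = ProfileReport(df, title="Mobility data profile")
    profile.to_file(html_path)
    return profile
```

A call such as build_report("mobility.csv", "report.html") would then write the report; both paths are placeholders.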

Variables

country_region_code
Categorical

HIGH CARDINALITY

Distinct: 134
Distinct (%): < 0.1%
Missing: 1581
Missing (%): 0.1%
Memory size: 18.0 MiB

Value | Count | Frequency (%)
US | 538042 | 22.8%
BR | 372159 | 15.8%
IN | 134996 | 5.7%
TR | 106829 | 4.5%
GB | 84876 | 3.6%
AR | 83788 | 3.6%
PL | 77298 | 3.3%
NL | 71721 | 3.0%
CO | 66368 | 2.8%
AU | 54819 | 2.3%
Other values (124) | 763807 | 32.4%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 3
Median length: 2
Mean length: 2.000670972
Min length: 2

country_region
Categorical

HIGH CARDINALITY

Distinct: 135
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Value | Count | Frequency (%)
United States | 538042 | 22.8%
Brazil | 372159 | 15.8%
India | 134996 | 5.7%
Turkey | 106829 | 4.5%
United Kingdom | 84876 | 3.6%
Argentina | 83788 | 3.6%
Poland | 77298 | 3.3%
Netherlands | 71721 | 3.0%
Colombia | 66368 | 2.8%
Australia | 54819 | 2.3%
Other values (125) | 765388 | 32.5%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 22
Median length: 7
Mean length: 8.561637731
Min length: 4

sub_region_1
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1855
Distinct (%): 0.1%
Missing: 40330
Missing (%): 1.7%
Memory size: 18.0 MiB

Value | Count | Frequency (%)
State of São Paulo | 70288 | 3.0%
State of Minas Gerais | 47607 | 2.0%
Texas | 40059 | 1.7%
State of Rio Grande do Sul | 29212 | 1.2%
Georgia | 28321 | 1.2%
State of Paraná | 27964 | 1.2%
State of Bahia | 25873 | 1.1%
Virginia | 25156 | 1.1%
Buenos Aires Province | 25114 | 1.1%
State of Santa Catarina | 23461 | 1.0%
Other values (1845) | 1972899 | 83.7%
(Missing) | 40330 | 1.7%

[Figure: Frequencies of value counts]

Unique

Unique: 5
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 74
Median length: 11
Mean length: 12.22135999
Min length: 3

sub_region_2
Categorical

HIGH CARDINALITY
MISSING

Distinct: 9682
Distinct (%): 0.5%
Missing: 401840
Missing (%): 17.1%
Memory size: 18.0 MiB

Value | Count | Frequency (%)
Washington County | 5623 | 0.2%
Jefferson County | 4664 | 0.2%
Franklin County | 4382 | 0.2%
Jackson County | 4019 | 0.2%
Lincoln County | 3848 | 0.2%
Madison County | 3688 | 0.2%
Montgomery County | 3419 | 0.1%
Marion County | 3245 | 0.1%
Union County | 3128 | 0.1%
Monroe County | 3121 | 0.1%
Other values (9672) | 1915307 | 81.3%
(Missing) | 401840 | 17.1%

[Figure: Frequencies of value counts]

Unique

Unique: 7
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 56
Median length: 11
Mean length: 11.40147453
Min length: 2

metro_area
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 65
Distinct (%): 0.5%
Missing: 2343224
Missing (%): 99.4%
Memory size: 18.0 MiB

Value | Count | Frequency (%)
Lahore Metropolitan Area | 203 | < 0.1%
Montevideo Metropolitan Area | 203 | < 0.1%
Kuwait City Metropolitan Area | 203 | < 0.1%
Doha Metropolitan Area | 203 | < 0.1%
Bamako Metropolitan Area | 203 | < 0.1%
Ufa Metropolitan Area | 203 | < 0.1%
Marrakesh Metropolitan Area | 203 | < 0.1%
Rostov-on-Don Metropolitan Area | 203 | < 0.1%
Baguio Metropolitan Area | 203 | < 0.1%
Quetta Metropolitan Area | 203 | < 0.1%
Other values (55) | 11030 | 0.5%
(Missing) | 2343224 | 99.4%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 34
Median length: 3
Mean length: 3.125661423
Min length: 3

iso_3166_2_code
Categorical

HIGH CARDINALITY
MISSING

Distinct: 2143
Distinct (%): 0.5%
Missing: 1942157
Missing (%): 82.4%
Memory size: 18.0 MiB

Value | Count | Frequency (%)
RO-BT | 203 | < 0.1%
CI-09 | 203 | < 0.1%
LT-TE | 203 | < 0.1%
GB-BGE | 203 | < 0.1%
GB-HRY | 203 | < 0.1%
GT-GU | 203 | < 0.1%
IT-78 | 203 | < 0.1%
NI-CA | 203 | < 0.1%
GB-STT | 203 | < 0.1%
PL-WN | 203 | < 0.1%
Other values (2133) | 412097 | 17.5%
(Missing) | 1942157 | 82.4%

[Figure: Frequencies of value counts]

Unique

Unique: 5
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 6
Median length: 3
Mean length: 3.379635901
Min length: 3

census_fips_code
Real number (ℝ≥0)

MISSING

Distinct: 2833
Distinct (%): 0.5%
Missing: 1828595
Missing (%): 77.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 30338.94115
Minimum: 1001
Maximum: 56045
Zeros: 0
Zeros (%): 0.0%
Memory size: 18.0 MiB

Quantile statistics

Minimum: 1001
5-th percentile: 5099
Q1: 18109
median: 29105
Q3: 45057
95-th percentile: 53075
Maximum: 56045
Range: 55044
Interquartile range (IQR): 26948

Descriptive statistics

Standard deviation: 15296.7533
Coefficient of variation (CV): 0.5041953582
Kurtosis: -1.128529214
Mean: 30338.94115
Median Absolute Deviation (MAD): 12000
Skewness: -0.06710196046
Sum: 1.600952552e+10
Variance: 233990661.6
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
25011 | 203 | < 0.1%
25025 | 203 | < 0.1%
25023 | 203 | < 0.1%
25021 | 203 | < 0.1%
25017 | 203 | < 0.1%
25015 | 203 | < 0.1%
25013 | 203 | < 0.1%
25009 | 203 | < 0.1%
26005 | 203 | < 0.1%
25005 | 203 | < 0.1%
Other values (2823) | 525659 | 22.3%
(Missing) | 1828595 | 77.6%

Minimum 5 values

Value | Count | Frequency (%)
1001 | 203 | < 0.1%
1003 | 203 | < 0.1%
1005 | 203 | < 0.1%
1007 | 203 | < 0.1%
1009 | 203 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
56045 | 147 | < 0.1%
56043 | 153 | < 0.1%
56041 | 203 | < 0.1%
56039 | 203 | < 0.1%
56037 | 203 | < 0.1%
 

date
Categorical

HIGH CARDINALITY

Distinct: 203
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Value | Count | Frequency (%)
2020-07-09 | 12896 | 0.5%
2020-06-23 | 12893 | 0.5%
2020-07-16 | 12891 | 0.5%
2020-06-18 | 12891 | 0.5%
2020-07-30 | 12891 | 0.5%
2020-07-02 | 12891 | 0.5%
2020-06-24 | 12891 | 0.5%
2020-06-17 | 12890 | 0.5%
2020-07-03 | 12890 | 0.5%
2020-07-07 | 12888 | 0.5%
Other values (193) | 2227372 | 94.5%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10
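
The count of 203 distinct dates is consistent with an unbroken daily series. A small check, assuming the series runs from the date of the first sample row in this report (2020-02-15) to the date of the last sample row (2020-09-04); whether every calendar day in between is actually present is not confirmed by the report:

```python
from datetime import date

# Assumption: the first and last dates are taken from the report's sample rows.
first_day = date(2020, 2, 15)
last_day = date(2020, 9, 4)

# Inclusive number of calendar days in the range.
n_days = (last_day - first_day).days + 1
print(n_days)  # 203
```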

retail_and_recreation_percent_change_from_baseline
Real number (ℝ)

MISSING
ZEROS

Distinct: 422
Distinct (%): < 0.1%
Missing: 825720
Missing (%): 35.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -24.40305012
Minimum: -100
Maximum: 545
Zeros: 25590
Zeros (%): 1.1%
Memory size: 18.0 MiB

Quantile statistics

Minimum: -100
5-th percentile: -79
Q1: -48
median: -20
Q3: -1
95-th percentile: 18
Maximum: 545
Range: 645
Interquartile range (IQR): 47

Descriptive statistics

Standard deviation: 31.56306826
Coefficient of variation (CV): -1.293406689
Kurtosis: 2.430111479
Mean: -24.40305012
Median Absolute Deviation (MAD): 22
Skewness: 0.1635129616
Sum: -37350430
Variance: 996.2272777
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 25590 | 1.1%
-2 | 25170 | 1.1%
-1 | 25140 | 1.1%
1 | 24823 | 1.1%
2 | 24621 | 1.0%
-3 | 24544 | 1.0%
-4 | 23956 | 1.0%
-5 | 23327 | 1.0%
3 | 23223 | 1.0%
-6 | 22659 | 1.0%
Other values (412) | 1287511 | 54.6%
(Missing) | 825720 | 35.0%

Minimum 5 values

Value | Count | Frequency (%)
-100 | 171 | < 0.1%
-99 | 80 | < 0.1%
-98 | 428 | < 0.1%
-97 | 873 | < 0.1%
-96 | 1392 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
545 | 1 | < 0.1%
533 | 1 | < 0.1%
527 | 2 | < 0.1%
524 | 1 | < 0.1%
513 | 1 | < 0.1%

grocery_and_pharmacy_percent_change_from_baseline
Real number (ℝ)

MISSING
ZEROS

Distinct: 443
Distinct (%): < 0.1%
Missing: 854377
Missing (%): 36.3%
Infinite: 0
Infinite (%): 0.0%
Mean: -5.039780093
Minimum: -100
Maximum: 454
Zeros: 40653
Zeros (%): 1.7%
Memory size: 18.0 MiB

Quantile statistics

Minimum: -100
5-th percentile: -50
Q1: -16
median: -2
Q3: 8
95-th percentile: 29
Maximum: 454
Range: 554
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 25.18208074
Coefficient of variation (CV): -4.996662608
Kurtosis: 6.786753458
Mean: -5.039780093
Median Absolute Deviation (MAD): 12
Skewness: 0.1762826237
Sum: -7569281
Variance: 634.1371905
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
2 | 40943 | 1.7%
0 | 40653 | 1.7%
1 | 40313 | 1.7%
3 | 39959 | 1.7%
4 | 38839 | 1.6%
-1 | 38644 | 1.6%
-2 | 37516 | 1.6%
5 | 36165 | 1.5%
-3 | 35165 | 1.5%
6 | 34710 | 1.5%
Other values (433) | 1119000 | 47.5%
(Missing) | 854377 | 36.3%

Minimum 5 values

Value | Count | Frequency (%)
-100 | 76 | < 0.1%
-99 | 1 | < 0.1%
-98 | 43 | < 0.1%
-97 | 230 | < 0.1%
-96 | 547 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
454 | 1 | < 0.1%
400 | 1 | < 0.1%
387 | 1 | < 0.1%
385 | 1 | < 0.1%
382 | 1 | < 0.1%

parks_percent_change_from_baseline
Real number (ℝ)

MISSING

Distinct: 904
Distinct (%): 0.1%
Missing: 1210576
Missing (%): 51.4%
Infinite: 0
Infinite (%): 0.0%
Mean: -6.345571472
Minimum: -100
Maximum: 1206
Zeros: 10462
Zeros (%): 0.4%
Memory size: 18.0 MiB

Quantile statistics

Minimum: -100
5-th percentile: -82
Q1: -49
median: -16
Q3: 16
95-th percentile: 113
Maximum: 1206
Range: 1306
Interquartile range (IQR): 65

Descriptive statistics

Standard deviation: 65.94798252
Coefficient of variation (CV): -10.39275703
Kurtosis: 13.63809997
Mean: -6.345571472
Median Absolute Deviation (MAD): 32
Skewness: 2.484969709
Sum: -7270172
Variance: 4349.136399
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
-2 | 10515 | 0.4%
0 | 10462 | 0.4%
-1 | 10301 | 0.4%
-4 | 10243 | 0.4%
-3 | 10205 | 0.4%
-7 | 10096 | 0.4%
2 | 10061 | 0.4%
-6 | 10053 | 0.4%
1 | 10023 | 0.4%
-5 | 9937 | 0.4%
Other values (894) | 1043812 | 44.3%
(Missing) | 1210576 | 51.4%

Minimum 5 values

Value | Count | Frequency (%)
-100 | 3440 | 0.1%
-99 | 268 | < 0.1%
-98 | 748 | < 0.1%
-97 | 998 | < 0.1%
-96 | 1402 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1206 | 1 | < 0.1%
1187 | 1 | < 0.1%
1150 | 1 | < 0.1%
1149 | 2 | < 0.1%
1146 | 1 | < 0.1%

transit_stations_percent_change_from_baseline
Real number (ℝ)

MISSING

Distinct: 481
Distinct (%): < 0.1%
Missing: 1130892
Missing (%): 48.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -28.74046101
Minimum: -100
Maximum: 497
Zeros: 14325
Zeros (%): 0.6%
Memory size: 18.0 MiB

Quantile statistics

Minimum: -100
5-th percentile: -77
Q1: -52
median: -29
Q3: -6
95-th percentile: 18
Maximum: 497
Range: 597
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 31.57209016
Coefficient of variation (CV): -1.098524138
Kurtosis: 4.832432845
Mean: -28.74046101
Median Absolute Deviation (MAD): 23
Skewness: 0.7771535017
Sum: -35218331
Variance: 996.7968771
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 14325 | 0.6%
-35 | 14070 | 0.6%
-2 | 14049 | 0.6%
-40 | 14030 | 0.6%
-33 | 13967 | 0.6%
-32 | 13866 | 0.6%
-29 | 13859 | 0.6%
-36 | 13836 | 0.6%
-31 | 13818 | 0.6%
-30 | 13807 | 0.6%
Other values (471) | 1085765 | 46.1%
(Missing) | 1130892 | 48.0%

Minimum 5 values

Value | Count | Frequency (%)
-100 | 1386 | 0.1%
-99 | 1 | < 0.1%
-98 | 66 | < 0.1%
-97 | 228 | < 0.1%
-96 | 457 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
497 | 1 | < 0.1%
485 | 1 | < 0.1%
462 | 1 | < 0.1%
461 | 1 | < 0.1%
459 | 1 | < 0.1%

workplaces_percent_change_from_baseline
Real number (ℝ)

MISSING
ZEROS

Distinct: 282
Distinct (%): < 0.1%
Missing: 109641
Missing (%): 4.7%
Infinite: 0
Infinite (%): 0.0%
Mean: -21.9506526
Minimum: -100
Maximum: 258
Zeros: 40157
Zeros (%): 1.7%
Memory size: 18.0 MiB

Quantile statistics

Minimum: -100
5-th percentile: -59
Q1: -35
median: -22
Q3: -5
95-th percentile: 9
Maximum: 258
Range: 358
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 21.26001117
Coefficient of variation (CV): -0.9685366338
Kurtosis: 0.2736910935
Mean: -21.9506526
Median Absolute Deviation (MAD): 15
Skewness: -0.2249839977
Sum: -49315280
Variance: 451.9880751
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
-24 | 45900 | 1.9%
-25 | 45524 | 1.9%
-26 | 45368 | 1.9%
-27 | 45179 | 1.9%
-28 | 44314 | 1.9%
-23 | 44272 | 1.9%
-22 | 43306 | 1.8%
-29 | 42989 | 1.8%
1 | 42650 | 1.8%
2 | 42508 | 1.8%
Other values (272) | 1804633 | 76.6%
(Missing) | 109641 | 4.7%

Minimum 5 values

Value | Count | Frequency (%)
-100 | 2 | < 0.1%
-95 | 2 | < 0.1%
-94 | 11 | < 0.1%
-93 | 21 | < 0.1%
-92 | 69 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
258 | 1 | < 0.1%
248 | 1 | < 0.1%
246 | 1 | < 0.1%
241 | 3 | < 0.1%
239 | 1 | < 0.1%

residential_percent_change_from_baseline
Real number (ℝ)

MISSING
ZEROS

Distinct: 97
Distinct (%): < 0.1%
Missing: 1120763
Missing (%): 47.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.27266392
Minimum: -46
Maximum: 57
Zeros: 65648
Zeros (%): 2.8%
Memory size: 18.0 MiB

Quantile statistics

Minimum: -46
5-th percentile: -2
Q1: 3
median: 9
Q3: 16
95-th percentile: 27
Maximum: 57
Range: 103
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.007779345
Coefficient of variation (CV): 0.8768688838
Kurtosis: 0.1873143663
Mean: 10.27266392
Median Absolute Deviation (MAD): 6
Skewness: 0.6592314183
Sum: 12692092
Variance: 81.14008873
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 65648 | 2.8%
1 | 58716 | 2.5%
10 | 52080 | 2.2%
9 | 51312 | 2.2%
11 | 51164 | 2.2%
-1 | 50437 | 2.1%
8 | 49668 | 2.1%
7 | 49254 | 2.1%
12 | 49230 | 2.1%
6 | 49230 | 2.1%
Other values (87) | 708782 | 30.1%
(Missing) | 1120763 | 47.6%

Minimum 5 values

Value | Count | Frequency (%)
-46 | 2 | < 0.1%
-45 | 1 | < 0.1%
-43 | 1 | < 0.1%
-40 | 1 | < 0.1%
-39 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
57 | 1 | < 0.1%
56 | 2 | < 0.1%
55 | 5 | < 0.1%
54 | 12 | < 0.1%
53 | 13 | < 0.1%
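
Several of the statistics reported for each numeric variable are derived from others. A short sketch using the residential_percent_change_from_baseline figures above, assuming the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative:

```python
# Figures copied from the residential_percent_change_from_baseline section.
minimum, maximum = -46, 57
q1, q3 = 3, 16
mean = 10.27266392
std = 9.007779345

value_range = maximum - minimum  # report shows 103
iqr = q3 - q1                    # report shows 13
cv = std / mean                  # report shows 0.8768688838
variance = std ** 2              # report shows 81.14008873

print(value_range, iqr, round(cv, 6), round(variance, 4))
```

A negative mean, as in most of the other mobility variables, yields a negative CV under the same definition.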
 

Interactions

[Figures: 49 interaction plots, one per pair of the 7 numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
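
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; the function name pearson_r is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of the standard deviations cancel,
    # so plain sums of (cross-)deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

Shifting or positively rescaling either input leaves the result unchanged, which is the invariance mentioned above.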

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
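
A minimal plain-Python sketch of this calculation follows. It assumes tied values receive the average of their rank positions, a common convention that the report does not spell out; the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation applied to the rank variables of X and Y."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# A monotonic but nonlinear relationship.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

In the example y = x², the ranks of X and Y coincide, so ρ reaches its maximum even though the relationship is not linear.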

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
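
The pair-counting definition above can be sketched directly in Python. This illustrative version counts tied pairs as neither concordant nor discordant and applies no tie correction; tie-corrected variants exist, and which variant the report uses is not stated here. The function name kendall_tau is hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total pairs, without tie correction."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant.
print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```

This loop visits every pair and is therefore quadratic in the number of observations, which is fine for illustration but slow for millions of rows.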

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

[Figures: 4 missing-value plots]

Sample

First rows

(index) | country_region_code | country_region | sub_region_1 | sub_region_2 | metro_area | iso_3166_2_code | census_fips_code | date | retail_and_recreation_percent_change_from_baseline | grocery_and_pharmacy_percent_change_from_baseline | parks_percent_change_from_baseline | transit_stations_percent_change_from_baseline | workplaces_percent_change_from_baseline | residential_percent_change_from_baseline
0 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-15 | 0.0 | 4.0 | 5.0 | 0.0 | 2.0 | 1.0
1 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-16 | 1.0 | 4.0 | 4.0 | 1.0 | 2.0 | 1.0
2 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-17 | -1.0 | 1.0 | 5.0 | 1.0 | 2.0 | 1.0
3 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-18 | -2.0 | 1.0 | 5.0 | 0.0 | 2.0 | 1.0
4 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-19 | -2.0 | 0.0 | 4.0 | -1.0 | 2.0 | 1.0
5 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-20 | -2.0 | 1.0 | 6.0 | 1.0 | 1.0 | 1.0
6 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-21 | -3.0 | 2.0 | 6.0 | 0.0 | -1.0 | 1.0
7 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-22 | -2.0 | 2.0 | 4.0 | -2.0 | 3.0 | 1.0
8 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-23 | -1.0 | 3.0 | 3.0 | -1.0 | 4.0 | 1.0
9 | AE | United Arab Emirates | NaN | NaN | NaN | NaN | NaN | 2020-02-24 | -3.0 | 0.0 | 5.0 | -1.0 | 3.0 | 1.0

Last rows

(index) | country_region_code | country_region | sub_region_1 | sub_region_2 | metro_area | iso_3166_2_code | census_fips_code | date | retail_and_recreation_percent_change_from_baseline | grocery_and_pharmacy_percent_change_from_baseline | parks_percent_change_from_baseline | transit_stations_percent_change_from_baseline | workplaces_percent_change_from_baseline | residential_percent_change_from_baseline
2356274 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-08-24 | NaN | NaN | NaN | NaN | -4.0 | NaN
2356275 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-08-25 | NaN | NaN | NaN | NaN | 7.0 | NaN
2356276 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-08-26 | NaN | NaN | NaN | NaN | 1.0 | NaN
2356277 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-08-27 | NaN | NaN | NaN | NaN | 0.0 | NaN
2356278 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-08-28 | NaN | NaN | NaN | NaN | -3.0 | NaN
2356279 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-08-31 | NaN | NaN | NaN | NaN | -1.0 | NaN
2356280 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-09-01 | NaN | NaN | NaN | NaN | -2.0 | NaN
2356281 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-09-02 | NaN | NaN | NaN | NaN | 5.0 | NaN
2356282 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-09-03 | NaN | NaN | NaN | NaN | 6.0 | NaN
2356283 | ZW | Zimbabwe | Midlands Province | Kwekwe | NaN | NaN | NaN | 2020-09-04 | NaN | NaN | NaN | NaN | 2.0 | NaN